pacman::p_load(tidyverse, sf, sfdep, tmap, mapview, plotly, knitr)

Take-home Exercise 1: Geospatial Analytics for Public Good
1 Introduction
1.1 Background
Buses and the Mass Rapid Transit (MRT) system are the main modes of transport used by Singaporeans for their daily commutes. With an ever-increasing population, the key challenge for public bus operators is balancing supply against demand by optimising their services, fleets, and manpower across different bus routes.
Hence, the identification and analysis of the movement patterns of public buses can provide insights on commuting in Singapore, allowing for the development of better urban management strategies by both the public and private sectors.
1.2 The Study Area
The study area is Singapore, a city-state in Southeast Asia, with a total land area of about 730 square kilometres and a total population of 5.92 million (as at June 2023).
Commuting by public bus is the most common mode of transport in Singapore, with an average daily ridership of about 3.461 million in 2022, according to the LTA. The public bus fleet comprises approximately 5,800 buses, serving more than 300 routes and 5,000 bus stops.
1.3 The Analytical Questions
In this take-home exercise, I aim to answer the following analytical questions:
Is the demand for public bus transport evenly distributed geographically?
If no, are there signs of spatial clustering?
If yes, where are these clusters, and what kind of clusters are they?
Based on the analysis, what are the insights that can be derived for urban transport planning?
1.4 Objectives and Tasks
Hence, the objectives of this take-home exercise are to:
Apply Exploratory Spatial Data Analysis (ESDA) to obtain a preliminary understanding of public bus movements in Singapore; and
Apply appropriate Local Indicators of Spatial Association (LISA) Analysis or Emerging Hot Spot Analysis (EHSA) to uncover the spatial and/or spatio-temporal mobility patterns of public bus passengers in Singapore.
The detailed tasks are:
Geovisualisation and Analysis:
Compute passenger trips generated by origin at the hexagon level for:
The peak hour periods and their bus tap-on times are:
Weekday Morning Peak: 6am to 9am
Weekday Afternoon Peak: 5pm to 8pm
Weekend/Holiday Morning Peak: 11am to 2pm
Weekend/Holiday Evening Peak: 4pm to 7pm
Display the geographical distribution of the passenger trips using an appropriate geovisualisation method.
Describe the spatial patterns revealed by the geovisualisation.
EITHER:
Local Indicators of Spatial Association (LISA) Analysis:
Compute LISA of the passenger trips generated by origin at the hexagon level.
Display the LISA maps of the passenger trips generated by origin at the hexagon level - for those that are statistically significant.
Draw statistical conclusions from the analysis results.
OR:
Emerging Hot Spot Analysis (EHSA) with reference to the passenger trips generated by origin at the hexagon level for the four time intervals explored in 1. above:
Perform Mann-Kendall Test using the spatio-temporal local Gi* values.
Prepare EHSA maps of the Gi* values of the passenger trips generated by origin at the hexagon level - for those that are statistically significant.
Describe the spatial patterns revealed with reference to the EHSA maps and data visualisation prepared.
Either 2. or 3. is required to be completed. I have attempted to complete both to obtain a more comprehensive understanding of the patterns in the data.
2 Getting Started
2.1 Setting the Analytical Tools
The R packages used in this take-home exercise are:
tidyverse (i.e. readr, tidyr, dplyr) for performing data science tasks such as importing, tidying, and wrangling data;
sf for importing, managing, and processing geospatial data;
sfdep for analysing spatial dependence and spatial relationships in data (building on spdep);
tmap for thematic mapping;
mapview for interactive viewing of spatial objects;
plotly for making interactive plots; and
knitr for embedding R code in different document formats (e.g., HTML) to facilitate dynamic report generation.
They are loaded into the R environment using the pacman::p_load() call shown at the top of this page.
2.2 Data Sources
The Land Transport Authority (LTA) studies commuter movements using the data collected from the use of smart cards and the Global Positioning System (GPS) devices on public buses. The LTA DataMall shares some of these data publicly, which helps to speed up the development of practical solutions to enhance reliability and efficiency of the transport system by other stakeholders, such as the private sector and individuals.
The data sets used in this take-home exercise are:
Passenger Volume by Origin Destination Bus Stops from the LTA DataMall for the period of August 2023 to October 2023. There are three aspatial data sets (one for each month) in csv format, each containing the number of trips by weekdays and weekends/holidays from origin to destination bus stops.
Bus Stop Location from the LTA DataMall. This is a geospatial data set in ESRI shapefile format. It contains a point representation that indicates the position of each bus stop where buses stop to pick up or drop off passengers.
The data sets are placed under two sub-folders:
geospatial (Bus Stop Location), and
aspatial (Passenger Volume by Origin Destination Bus Stops).
These two sub-folders are within the data folder of my Take-home_Ex1 folder.
3 Data Wrangling
3.1 Importing and Merging the Aspatial Data - Passenger Volume
The three csv files are imported and combined into a tibble data frame, odbus, using functions from the base package, as shown in the following steps:
- Generate the list of file names of the csv files using the list.files() function in the base package.
filenames = list.files(path="data/aspatial/", pattern="*.csv")
filenames
[1] "origin_destination_bus_202308.csv" "origin_destination_bus_202309.csv"
[3] "origin_destination_bus_202310.csv"
- Generate the paths to the csv files using the file.path() function in the base package.
path = file.path("data/aspatial", filenames)
path
[1] "data/aspatial/origin_destination_bus_202308.csv"
[2] "data/aspatial/origin_destination_bus_202309.csv"
[3] "data/aspatial/origin_destination_bus_202310.csv"
- Merge the three csv files using the do.call(), lapply(), and read.csv() functions, all in the base package.

odbus = do.call("rbind", lapply(path, FUN=function(files){ read.csv(files)}))

- Check using the unique() function in the base package to confirm that the data sets for each of the three months have been incorporated. [Note: Although the three data sets have been merged into a single data frame, the observations are analysed month by month, rather than aggregated across all three months.]
unique(odbus$YEAR_MONTH)
[1] "2023-08" "2023-09" "2023-10"
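The list.files()/file.path()/do.call() pattern above can be reproduced on a small, self-contained scale with temporary files (the file names and values below are hypothetical stand-ins for the monthly LTA files):

```r
# Write two tiny csv files to a temporary directory.
tmp <- tempdir()
write.csv(data.frame(YEAR_MONTH = "2023-08", TOTAL_TRIPS = 10),
          file.path(tmp, "demo_202308.csv"), row.names = FALSE)
write.csv(data.frame(YEAR_MONTH = "2023-09", TOTAL_TRIPS = 20),
          file.path(tmp, "demo_202309.csv"), row.names = FALSE)

# Same pattern as in the text: list file names, build paths, read, row-bind.
filenames <- list.files(path = tmp, pattern = "demo_.*\\.csv")
paths     <- file.path(tmp, filenames)
demo      <- do.call("rbind", lapply(paths, read.csv))

unique(demo$YEAR_MONTH)  # "2023-08" "2023-09"
```

Because list.files() returns names in alphabetical order, the months are bound in chronological order here without any extra sorting.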
Once the combined data set has been obtained, the n_distinct() function in the dplyr package and the sum() function in the base package are applied to uncover the following observations regarding odbus:
Overall, there are 17,118,005 rows (observations) and 7 columns. There are 5,709,512, 5,714,196, and 5,694,297 rows from August, September, and October 2023 respectively, reflecting a generally stable total ridership over the three-month period.
The “YEAR_MONTH” column has three unique values, reflecting the observations from August, September, and October 2023.
The “DAY_TYPE” column has two unique values, reflecting observations from weekdays or weekends/holidays.
The “TIME_PER_HOUR” column has 24 unique values, reflecting that the observations are broken down into each of the 24 hours of a day.
The “PT_TYPE” column only has the value “BUS” as the public transport type. Hence, it may be dropped.
The “ORIGIN_PT_CODE” and “DESTINATION_PT_CODE” columns have 5,075 and 5,079 unique values respectively, reflecting the number of bus stops with bus routes passing through them.
The “TOTAL_TRIPS” column contains values that reflect the number of passengers for each unique combination of month, type of day, hour, origin bus stop, and destination bus stop. The minimum value is 1, i.e., there are no rows with zero values.
sapply(odbus, function(x) n_distinct(x))
YEAR_MONTH DAY_TYPE TIME_PER_HOUR PT_TYPE
3 2 24 1
ORIGIN_PT_CODE DESTINATION_PT_CODE TOTAL_TRIPS
5075 5079 4854
sum(odbus$YEAR_MONTH == "2023-08")[1] 5709512
sum(odbus$YEAR_MONTH == "2023-09")[1] 5714196
sum(odbus$YEAR_MONTH == "2023-10")[1] 5694297
3.2 Importing and Transforming the Geospatial Data - Bus Stop Location
The Bus Stop Location shapefile is imported using st_read() in the sf package. The output is a simple feature data frame, busstop, in the SVY21 projected coordinate system. It has 5,161 features and 3 fields, plus a geometry column of points indicating the bus stop locations.
busstop = st_read(dsn = "data/geospatial",
layer = "BusStop")Reading layer `BusStop' from data source
`C:\jmphosis\ISSS624\Take-home_Ex\Take-home_Ex1\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
glimpse(busstop)
Rows: 5,161
Columns: 4
$ BUS_STOP_N <chr> "22069", "32071", "44331", "96081", "11561", "66191", "2338…
$ BUS_ROOF_N <chr> "B06", "B23", "B01", "B05", "B05", "B03", "B02A", "B02", "B…
$ LOC_DESC <chr> "OPP CEVA LOGISTICS", "AFT TRACK 13", "BLK 239", "GRACE IND…
$ geometry <POINT [m]> POINT (13576.31 32883.65), POINT (13228.59 44206.38),…
The st_crs() function in the sf package is then used to check the coordinate reference system of the busstop simple feature data frame. The output shows that although the data are projected in SVY21, the correct EPSG code of 3414 has not been assigned (note that the code 9001 appearing in the WKT output identifies the metre length unit, not the CRS).
st_crs(busstop)
Coordinate Reference System:
User input: SVY21
wkt:
PROJCRS["SVY21",
BASEGEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ID["EPSG",6326]],
PRIMEM["Greenwich",0,
ANGLEUNIT["Degree",0.0174532925199433]]],
CONVERSION["unnamed",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]]]
To assign the correct EPSG code, the st_set_crs() function in the sf package is applied. Note that st_set_crs() only relabels the coordinate reference system; it does not transform the coordinate values. A check to confirm that the correct EPSG code has been assigned is then made using the st_crs() function again.
busstop = st_set_crs(busstop, 3414)
st_crs(busstop)
Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
3.3 Checking for Duplicates and Missing Values
The data sets from the LTA DataMall are expected to be relatively clean. Nevertheless, due diligence checks for duplicates and missing values are still made to confirm the assumptions.
- The duplicated() function in the base package is used to check for duplicates in odbus. There are no duplicates in the tibble data frame.
odbus[duplicated(odbus), ]
[1] YEAR_MONTH DAY_TYPE TIME_PER_HOUR
[4] PT_TYPE ORIGIN_PT_CODE DESTINATION_PT_CODE
[7] TOTAL_TRIPS
<0 rows> (or 0-length row.names)
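The behaviour of duplicated() on whole rows can be illustrated with a toy data frame (the values below are hypothetical):

```r
# duplicated() flags a row as TRUE only from its second occurrence onwards,
# so subsetting with it returns the repeated copies, not the originals.
toy <- data.frame(code = c(1L, 2L, 2L), trips = c(5L, 7L, 7L))
toy[duplicated(toy), ]  # returns the single repeated row (code 2, trips 7)
```

This is why the check on busstop below returns only one row even though two rows share the same geometry.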
- The duplicated() function in the base package and the st_geometry() function in the sf package are used to check for duplicates in busstop. The output returned Bus Stop No. 96319 at row 3265. On closer inspection of the simple feature data frame, rows 3264 and 3265 are found to be duplicates. Hence, row 3264 is removed using the row_number() function in the dplyr package. The same check is conducted again to confirm that there are no more duplicates.
busstop[duplicated(st_geometry(busstop)), ]
Simple feature collection with 1 feature and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 42187.23 ymin: 34995.78 xmax: 42187.23 ymax: 34995.78
Projected CRS: SVY21 / Singapore TM
BUS_STOP_N BUS_ROOF_N LOC_DESC geometry
3265 96319 NIL YUSEN LOGISTICS POINT (42187.23 34995.78)
busstop = busstop %>%
filter(row_number() != 3264)
busstop[duplicated(st_geometry(busstop)), ]
Simple feature collection with 0 features and 3 fields
Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] BUS_STOP_N BUS_ROOF_N LOC_DESC geometry
<0 rows> (or 0-length row.names)
- The is.na() and sum() functions in the base package and the summarise_all() function in the dplyr package are used to check for missing values in odbus. There are no missing values in the tibble data frame.
odbus %>%
summarise_all(~ sum(is.na(.)))
YEAR_MONTH DAY_TYPE TIME_PER_HOUR PT_TYPE ORIGIN_PT_CODE DESTINATION_PT_CODE
1 0 0 0 0 0 0
TOTAL_TRIPS
1 0
- The is.na() and sum() functions in the base package are used to check for missing values in busstop. There are no missing values in the simple feature data frame.
sum(is.na(st_geometry(busstop)))
[1] 0
3.4 Aggregating and Selecting Columns
Given that the analysis focuses on passenger volume by origin bus stop, the group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) by “ORIGIN_PT_CODE”, “YEAR_MONTH”, “DAY_TYPE”, and “TIME_PER_HOUR” in odbus. At the same time, the columns “DESTINATION_PT_CODE” and “PT_TYPE” are dropped as they are not required. The number of rows (observations) is condensed from 17,118,005 to 571,696.
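The effect of this grouped summation can be sketched on toy data (the stop codes and trip counts below are hypothetical) using base R's aggregate(), which mirrors what the group_by() + summarise() chain does:

```r
# Two records from the same origin stop collapse into one aggregated row.
toy <- data.frame(
  ORIGIN_PT_CODE = c(1012L, 1012L, 2013L),
  TOTAL_TRIPS    = c(2L, 3L, 4L)
)
agg <- aggregate(TOTAL_TRIPS ~ ORIGIN_PT_CODE, data = toy, FUN = sum)
agg  # stop 1012 now has TOTAL_TRIPS = 5; stop 2013 keeps 4
```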
odbus = odbus %>%
group_by(ORIGIN_PT_CODE, YEAR_MONTH, DAY_TYPE, TIME_PER_HOUR) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The select() function in the dplyr package is used to drop the “BUS_ROOF_N” column in busstop since it is not required.
busstop = busstop %>%
select(-BUS_ROOF_N)

3.5 Performing Relational Join
The simple feature data frame, busstop, and the tibble data frame, odbus, are combined using the inner_join() function in the dplyr package, by matching the “BUS_STOP_N” column in busstop with the “ORIGIN_PT_CODE” column in odbus. However, as the “BUS_STOP_N” values are character values, they need to be converted to integer values before the join, so that character values such as “01012” in busstop can be matched to integer values such as 1012 in odbus.
Upon joining the two data frames, the output is the simple feature data frame, bus (since busstop was placed as the first argument in the inner_join() function). The number of rows (observations) is reduced from 571,696 to 567,725, i.e., 3,971 rows are dropped. This is likely because some new bus stops, such as Bus Stop No. 3549 (Shenton Way Stn Exit 3) and Bus Stop No. 96461 (Bef Aviation Pk Rd), were not included in the Bus Stop Location shapefile, which was last updated in July 2023. This means that the analysis in this take-home exercise is limited in this aspect.
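The leading-zero issue can be reproduced with a toy example in base R (the stop codes below are hypothetical); before conversion the zero-padded character code fails to match, and after as.integer() both codes match:

```r
stops <- data.frame(BUS_STOP_N = c("01012", "96319"))
trips <- data.frame(ORIGIN_PT_CODE = c(1012L, 96319L), TOTAL_TRIPS = c(5L, 7L))

# "01012" does not equal "1012" as a string, so only one code matches here.
before <- merge(stops, trips, by.x = "BUS_STOP_N", by.y = "ORIGIN_PT_CODE")

# Converting to integer drops the leading zero, and both codes now match.
stops$BUS_STOP_N <- as.integer(stops$BUS_STOP_N)
after <- merge(stops, trips, by.x = "BUS_STOP_N", by.y = "ORIGIN_PT_CODE")

nrow(before)  # 1
nrow(after)   # 2
```

The same logic applies to the inner_join() used in the analysis: without the type conversion, zero-padded bus stop numbers would silently fail to match.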
busstop = busstop %>%
mutate(BUS_STOP_N = as.integer(BUS_STOP_N))
bus = inner_join(busstop, odbus,
by = c("BUS_STOP_N" = "ORIGIN_PT_CODE"))3.6 Creating Spatial Hexagon Grid
Spatial grids are commonly used in spatial analysis. Regularly shaped grids may comprise equilateral triangles, squares, or hexagons. The hexagon is the most circular polygon that can tessellate to form an evenly spaced grid, providing a low perimeter-to-area ratio that reduces sampling bias due to edge effects of the grid shape.
A more circular polygon means that points near the border are closer to the centroid. Hexagons are often used when the analysis involves aspects of connectivity or movement paths. Locating neighbours is simpler using a hexagon grid because the contact edge or length is consistent on each side, resulting in equidistant centroids for each neighbour.
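The perimeter-to-area claim can be checked numerically with the standard area formulas: for cells of equal area, a regular hexagon has a lower perimeter-to-area ratio than a square.

```r
# Perimeter-to-area ratio for a unit-area square and a unit-area regular
# hexagon (area of a regular hexagon = 3 * sqrt(3) / 2 * side^2).
area <- 1

sq_side  <- sqrt(area)
sq_ratio <- 4 * sq_side / area   # 4

hex_side  <- sqrt(2 * area / (3 * sqrt(3)))
hex_ratio <- 6 * hex_side / area # about 3.72

hex_ratio < sq_ratio             # TRUE: hexagons have less edge per unit area
```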
A spatial hexagon grid, hgrid, for the study area of Singapore is created using the following code:
area_hexagon_grid = st_make_grid(bus, c(500, 500), what = "polygons", square = FALSE)
hgrid = st_sf(area_hexagon_grid) %>%
mutate(grid_id = seq_along(area_hexagon_grid))

4 Exploratory Data Analysis - Computing, Visualising, and Deriving Insights on Passenger Trips by Origin
The same steps for 4.1 are repeated for the three other time periods studied in 4.2, 4.3, and 4.4.
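Since the same filter, spatial join, and summarise steps are repeated twelve times with different parameters, they could be wrapped in a helper function. The sketch below is hypothetical (subset_trips() and the toy data are not part of the original analysis, and base R subsetting stands in for the dplyr chain), but it shows the idea for the filtering step:

```r
# Hypothetical helper mirroring the repeated filter() chains: subset trip
# records by month, day type, and an inclusive range of tap-on hours.
subset_trips <- function(df, month, day_type, from_hr, to_hr) {
  df[df$YEAR_MONTH == month &
       df$DAY_TYPE == day_type &
       df$TIME_PER_HOUR >= from_hr &
       df$TIME_PER_HOUR <= to_hr, ]
}

# Toy stand-in for the `bus` data frame:
toy <- data.frame(
  YEAR_MONTH    = c("2023-08", "2023-08", "2023-09"),
  DAY_TYPE      = c("WEEKDAY", "WEEKDAY", "WEEKENDS/HOLIDAY"),
  TIME_PER_HOUR = c(6L, 10L, 7L),
  TOTAL_TRIPS   = c(3L, 4L, 5L)
)

nrow(subset_trips(toy, "2023-08", "WEEKDAY", 6, 8))  # 1
```

A single helper like this would also make it harder for the hour bounds to drift between months, since the window is specified in exactly one place per period.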
4.1 Weekday Morning Peak (6am to 8.59am)
The same steps for 4.1.1 are repeated for the two other months studied in 4.1.2, and 4.1.3.
4.1.1 August 2023
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frame, wdAMAug, for the passenger trips by origin during weekday morning peak periods in August 2023.
wdAMAug = bus %>%
filter(YEAR_MONTH == "2023-08") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 8)

The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
wdAMAug_ct = st_join(hgrid, wdAMAug) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The output is then plotted using functions in the tmap package. To facilitate comparisons between different peak periods, the breaks are set at fixed intervals of 50,000 passengers.
brk = c(1, 50000, 100000, 150000, 200000, 250000, 300000, 350000)

tmap_mode("view")
map_wdAMAug = tm_shape(wdAMAug_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Morning Peak in Aug 2023")
map_wdAMAug

4.1.2 September 2023
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frame, wdAMSep, for the passenger trips by origin during weekday morning peak periods in September 2023.
wdAMSep = bus %>%
filter(YEAR_MONTH == "2023-09") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 8)

The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
wdAMSep_ct = st_join(hgrid, wdAMSep) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The output is then plotted using functions in the tmap package.
tmap_mode("view")
map_wdAMSep = tm_shape(wdAMSep_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Morning Peak in Sep 2023")
map_wdAMSep

4.1.3 October 2023
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frame, wdAMOct, for the passenger trips by origin during weekday morning peak periods in October 2023.
wdAMOct = bus %>%
filter(YEAR_MONTH == "2023-10") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 8)

The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
wdAMOct_ct = st_join(hgrid, wdAMOct) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The output is then plotted using functions in the tmap package.
tmap_mode("view")
map_wdAMOct = tm_shape(wdAMOct_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Morning Peak in Oct 2023")
map_wdAMOct

4.1.4 Insights
xxx
4.2 Weekday Afternoon Peak (5pm to 7.59pm)
The same steps for 4.2.1 are repeated for the two other months studied in 4.2.2, and 4.2.3.
4.2.1 August 2023
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frame, wdPMAug, for the passenger trips by origin during weekday afternoon peak periods in August 2023.
wdPMAug = bus %>%
filter(YEAR_MONTH == "2023-08") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 17 & TIME_PER_HOUR <= 19)

The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
wdPMAug_ct = st_join(hgrid, wdPMAug) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The output is then plotted using functions in the tmap package.
tmap_mode("view")
map_wdPMAug = tm_shape(wdPMAug_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Afternoon Peak in Aug 2023")
map_wdPMAug

4.2.2 September 2023
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frame, wdPMSep, for the passenger trips by origin during weekday afternoon peak periods in September 2023.
wdPMSep = bus %>%
filter(YEAR_MONTH == "2023-09") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 17 & TIME_PER_HOUR <= 19)

The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
wdPMSep_ct = st_join(hgrid, wdPMSep) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The output is then plotted using functions in the tmap package.
tmap_mode("view")
map_wdPMSep = tm_shape(wdPMSep_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Afternoon Peak in Sep 2023")
map_wdPMSep

4.2.3 October 2023
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frame, wdPMOct, for the passenger trips by origin during weekday afternoon peak periods in October 2023.
wdPMOct = bus %>%
filter(YEAR_MONTH == "2023-10") %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 17 & TIME_PER_HOUR <= 19)

The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
wdPMOct_ct = st_join(hgrid, wdPMOct) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The output is then plotted using functions in the tmap package.
tmap_mode("view")
map_wdPMOct = tm_shape(wdPMOct_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekday Afternoon Peak in Oct 2023")
map_wdPMOct

4.2.4 Insights
xxx
4.3 Weekend/Holiday Morning Peak (11am to 1.59pm)
The same steps for 4.3.1 are repeated for the two other months studied in 4.3.2, and 4.3.3.
4.3.1 August 2023
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frame, weAMAug, for the passenger trips by origin during weekend/holiday morning peak periods in August 2023.
weAMAug = bus %>%
filter(YEAR_MONTH == "2023-08") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 11 & TIME_PER_HOUR <= 13)

The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
weAMAug_ct = st_join(hgrid, weAMAug) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The output is then plotted using functions in the tmap package.
tmap_mode("view")
map_weAMAug = tm_shape(weAMAug_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Morning Peak in Aug 2023")
map_weAMAug

4.3.2 September 2023
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frame, weAMSep, for the passenger trips by origin during weekend/holiday morning peak periods in September 2023.
weAMSep = bus %>%
filter(YEAR_MONTH == "2023-09") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 11 & TIME_PER_HOUR <= 13)

The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
weAMSep_ct = st_join(hgrid, weAMSep) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The output is then plotted using functions in the tmap package.
tmap_mode("view")
map_weAMSep = tm_shape(weAMSep_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Morning Peak in Sep 2023")
map_weAMSep

4.3.3 October 2023
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frame, weAMOct, for the passenger trips by origin during weekend/holiday morning peak periods in October 2023.
weAMOct = bus %>%
filter(YEAR_MONTH == "2023-10") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 11 & TIME_PER_HOUR <= 13)

The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
weAMOct_ct = st_join(hgrid, weAMOct) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The output is then plotted using functions in the tmap package.
tmap_mode("view")
map_weAMOct = tm_shape(weAMOct_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Morning Peak in Oct 2023")
map_weAMOct

4.3.4 Insights
xxx
4.4 Weekend/Holiday Evening Peak (4pm to 6.59pm)
The same steps for 4.4.1 are repeated for the two other months studied in 4.4.2, and 4.4.3.
4.4.1 August 2023
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frame, wePMAug, for the passenger trips by origin during weekend/holiday evening peak periods in August 2023.
wePMAug = bus %>%
filter(YEAR_MONTH == "2023-08") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 16 & TIME_PER_HOUR <= 18)

The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
wePMAug_ct = st_join(hgrid, wePMAug) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The output is then plotted using functions in the tmap package.
tmap_mode("view")
map_wePMAug = tm_shape(wePMAug_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Evening Peak in Aug 2023")
map_wePMAug

4.4.2 September 2023
The filter() function in the dplyr package is used to obtain a subset of the simple feature data frame, wePMSep, for the passenger trips by origin during weekend/holiday evening peak periods in September 2023.
wePMSep = bus %>%
filter(YEAR_MONTH == "2023-09") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 16 & TIME_PER_HOUR <= 18)

The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points with the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
wePMSep_ct = st_join(hgrid, wePMSep) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))

The output is then plotted using functions in the tmap package.
tmap_mode("view")
map_wePMSep = tm_shape(wePMSep_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Evening Peak in Sep 2023")
map_wePMSep
4.4.3 October 2023
The filter() function in the dplyr package is used to obtain wePMOct, a subset of the bus simple feature data frame containing the passenger trips by origin during the weekend/holiday evening peak period in October 2023.
wePMOct = bus %>%
filter(YEAR_MONTH == "2023-10") %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 16 & TIME_PER_HOUR <= 18)
The st_join() function in the sf package and the drop_na() function in the tidyr package are used to match the bus stop points to the hexagons in the grid. The group_by() and summarise() functions in the dplyr package are used to sum the “TOTAL_TRIPS” values (i.e., total passengers) for each hexagon in the grid.
wePMOct_ct = st_join(hgrid, wePMOct) %>%
drop_na() %>%
group_by(grid_id) %>%
summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))
The output is then plotted using functions in the tmap package.
tmap_mode("view")
map_wePMOct = tm_shape(wePMOct_ct) +
tm_fill(col = "TOTAL_TRIPS",
palette = "Reds",
breaks = brk,
title = "No. of Passenger Trips",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("No. of Passenger Trips: " = "TOTAL_TRIPS"),
popup.format = list(TOTAL_TRIPS = list(format = "f", digits = 0))) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "Weekend/Holiday Evening Peak in Oct 2023")
map_wePMOct
4.4.4 Insights
xxx
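The four peak-period workflows in this and the preceding sections repeat the same filter → st_join → summarise → tmap pipeline twelve times. As a sketch only (assuming the bus, hgrid, and brk objects created earlier, with the column names used above), the repetition could be collapsed into a helper function:

```r
# Sketch (assumes bus, hgrid, and brk from earlier sections): wraps the
# repeated filter -> st_join -> drop_na -> summarise -> tmap pipeline.
peak_map <- function(ym, day_type, hours, map_title) {
  ct <- bus %>%
    filter(YEAR_MONTH == ym,
           DAY_TYPE == day_type,
           TIME_PER_HOUR >= hours[1] & TIME_PER_HOUR <= hours[2]) %>%
    st_join(hgrid, .) %>%          # hexagons as x, trips as y
    drop_na() %>%
    group_by(grid_id) %>%
    summarise(TOTAL_TRIPS = sum(TOTAL_TRIPS))
  tm_shape(ct) +
    tm_fill(col = "TOTAL_TRIPS", palette = "Reds", breaks = brk,
            title = "No. of Passenger Trips", id = "grid_id",
            showNA = FALSE, alpha = 0.5) +
    tm_borders(col = "grey40", lwd = 0.7) +
    tm_layout(title = map_title)
}

# e.g., the August weekend/holiday evening peak map could be rebuilt as:
# peak_map("2023-08", "WEEKENDS/HOLIDAY", c(16, 18),
#          "Weekend/Holiday Evening Peak in Aug 2023")
```

The explicit code chunks above are kept for transparency; the helper is simply one way to make the shared pipeline easier to audit and change in one place.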
4.5 Comparison Between Different Peak Periods
4.5.1 August 2023
Weekday AM versus Weekday PM:
tmap_arrange(map_wdAMAug, map_wdPMAug, ncol = 2)
Weekend/Holiday AM versus Weekend/Holiday PM:
tmap_arrange(map_weAMAug, map_wePMAug, ncol = 2)
4.5.2 September 2023
Weekday AM versus Weekday PM:
tmap_arrange(map_wdAMSep, map_wdPMSep, ncol = 2)
Weekend/Holiday AM versus Weekend/Holiday PM:
tmap_arrange(map_weAMSep, map_wePMSep, ncol = 2)
4.5.3 October 2023
Weekday AM versus Weekday PM:
tmap_arrange(map_wdAMOct, map_wdPMOct, ncol = 2)
Weekend/Holiday AM versus Weekend/Holiday PM:
tmap_arrange(map_weAMOct, map_wePMOct, ncol = 2)
4.5.4 Insights
xxx
5 Local Indicators of Spatial Association (LISA) Analysis
Local Indicators of Spatial Association (LISA) are statistics that evaluate the existence of clusters in the spatial arrangement of a given variable. An appropriate LISA, in this case the local Moran’s I, is applied to detect clusters and/or outliers in the bus passenger trips in the bus simple feature data frame, month by month.
5.1 Computing Distance-based Weights
Distance-based weights are used in this take-home exercise. They are more appropriate than contiguity-based spatial weights in this context because the hexagons are spread out across Singapore, and some hexagons have no contiguous neighbours, for example where they are separated by reservoirs or waterways.
There are three commonly used types of distance-based spatial weights: fixed distance weights, adaptive distance weights, and inverse distance weights (IDW). For this take-home exercise, a combination of IDW and adaptive distance weights is used. IDW assigns higher weights to neighbours that are closer and lower weights to neighbours that are further away, while adaptive distance weights give every hexagon the same number of neighbours regardless of the density of bus stops in its surrounding area.
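For reference, a common parameterisation of inverse-distance weighting takes the following form (this is the textbook IDW definition, not quoted from the sfdep documentation; the exact scaling applied by st_inverse_distance() should be checked against its help page):

```latex
% Inverse-distance weight between hexagons i and j, where d_{ij} is the
% distance between their centroids. With scale = 1 and alpha = 1 this
% reduces to w_{ij} = 1 / d_{ij}.
w_{ij} = \left( \frac{d_{ij}}{\text{scale}} \right)^{-\alpha}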
The st_knn() function in the sfdep package is used to identify neighbours based on k (e.g., k = 8 selects the eight nearest neighbours); the output is a list of neighbours, nb. The st_inverse_distance() function in the sfdep package is then used to calculate the inverse distance weights of the neighbours in nb.
For Weekday Morning Peak:
wdAMAug_wt = wdAMAug_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
wdAMSep_wt = wdAMSep_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
wdAMOct_wt = wdAMOct_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
For Weekday Afternoon Peak:
wdPMAug_wt = wdPMAug_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
wdPMSep_wt = wdPMSep_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
wdPMOct_wt = wdPMOct_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
For Weekend/Holiday Morning Peak:
weAMAug_wt = weAMAug_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
weAMSep_wt = weAMSep_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
weAMOct_wt = weAMOct_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
For Weekend/Holiday Evening Peak:
wePMAug_wt = wePMAug_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
wePMSep_wt = wePMSep_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
wePMOct_wt = wePMOct_ct %>%
mutate(nb = st_knn(area_hexagon_grid,
k=8),
wt = st_inverse_distance(nb,
area_hexagon_grid,
scale = 1,
alpha = 1),
.before = 1)
5.2 Computing Local Indicators of Spatial Association (LISA) of Passenger Trips by Origin
The local_moran() function in the sfdep package is used to compute the local Moran’s I value of “TOTAL_TRIPS” at the hexagon level.
- The output is a simple feature data frame containing the columns ii, eii, var_ii, z_ii, p_ii, p_ii_sim, and p_folded_sim:
  - ii: local Moran statistic.
  - eii: expectation of the local Moran statistic; for local_moran_perm(), the permutation sample means.
  - var_ii: variance of the local Moran statistic; for local_moran_perm(), the permutation sample standard deviations.
  - z_ii: standard deviate of the local Moran statistic; for local_moran_perm(), based on the permutation sample means and standard deviations.
  - p_ii: p-value of the local Moran statistic using pnorm(); for local_moran_perm(), using standard deviates based on the permutation sample means and standard deviations.
  - p_ii_sim: for local_moran_perm(), rank() and punif() of the observed statistic rank for [0, 1] p-values using alternative=.
  - p_folded_sim: the simulation folded [0, 0.5] range ranked p-value.
  - skewness: for local_moran_perm(), the output of e1071::skewness() for the permutation samples underlying the standard deviates.
  - kurtosis: for local_moran_perm(), the output of e1071::kurtosis() for the permutation samples underlying the standard deviates.
- The unnest() function in the tidyr package is used to expand the resulting list-column into rows and columns.
- In terms of interpretation:
  - a positive local Moran’s I value indicates a hexagon surrounded by similar values (High-High or Low-Low);
  - a negative value indicates a hexagon surrounded by dissimilar values (High-Low or Low-High); and
  - a value near zero suggests no significant local spatial autocorrelation.
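For context, the standard definition of the local Moran's I for area i (following Anselin's formulation, with the spatial weights w_{ij} computed in Section 5.1) can be written as:

```latex
% Local Moran's I for area i: x_i is the observed value ("TOTAL_TRIPS"),
% \bar{x} the sample mean, w_{ij} the spatial weights, and m_2 the
% second sample moment of x.
I_i = \frac{x_i - \bar{x}}{m_2} \sum_{j} w_{ij} \left( x_j - \bar{x} \right),
\qquad
m_2 = \frac{1}{n} \sum_{k=1}^{n} \left( x_k - \bar{x} \right)^2
```

The sign of I_i follows from whether x_i and its spatially weighted neighbourhood both sit on the same side of the mean, which is what the interpretation rules above capture.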
For Weekday Morning Peak:
wdAMAug_lisa = wdAMAug_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
wdAMSep_lisa = wdAMSep_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
wdAMOct_lisa = wdAMOct_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
For Weekday Afternoon Peak:
wdPMAug_lisa = wdPMAug_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
wdPMSep_lisa = wdPMSep_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
wdPMOct_lisa = wdPMOct_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
For Weekend/Holiday Morning Peak:
weAMAug_lisa = weAMAug_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
weAMSep_lisa = weAMSep_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
weAMOct_lisa = weAMOct_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
For Weekend/Holiday Evening Peak:
wePMAug_lisa = wePMAug_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
wePMSep_lisa = wePMSep_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
wePMOct_lisa = wePMOct_wt %>%
mutate(local_moran = local_moran(
TOTAL_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
5.3 Creating Local Indicators of Spatial Association (LISA) Maps of Passenger Trips by Origin
The LISA map is a categorical map showing outliers and clusters. There are two types of outliers, namely High-Low and Low-High outliers; likewise, there are two types of clusters, namely High-High and Low-Low clusters. The LISA map combines the local Moran’s I values of geographical areas with their respective p-values.
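As a sketch of where these four categories come from (using hypothetical vectors x and nb, not the objects above; sfdep's local_moran() already returns an equivalent categorisation in its mean column, which the maps below plot), each hexagon is classified by the sign of its standardised value and of its spatial lag:

```r
# Sketch (hypothetical x = variable of interest, nb = neighbour list):
# classify LISA quadrants from the sign of each standardised value z
# and its spatial lag (unweighted neighbour mean, as a simplification).
z     <- as.numeric(scale(x))
lag_z <- sapply(nb, function(j) mean(z[j]))
quad  <- dplyr::case_when(
  z >= 0 & lag_z >= 0 ~ "High-High",  # cluster
  z <  0 & lag_z <  0 ~ "Low-Low",    # cluster
  z >= 0 & lag_z <  0 ~ "High-Low",   # outlier
  TRUE                ~ "Low-High"    # outlier
)
```

Only hexagons whose permutation p-value falls below 0.05 are retained before mapping, which is what the filter(p_ii_sim < 0.05) step below does.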
For Weekday Morning Peak:
wdAMAug_lisasig = wdAMAug_lisa %>%
filter(p_ii_sim < 0.05)
wdAMAug_lisamap = tm_shape(wdAMAug_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Morning Peak in Aug 2023")
wdAMSep_lisasig = wdAMSep_lisa %>%
filter(p_ii_sim < 0.05)
wdAMSep_lisamap = tm_shape(wdAMSep_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Morning Peak in Sep 2023")
wdAMOct_lisasig = wdAMOct_lisa %>%
filter(p_ii_sim < 0.05)
wdAMOct_lisamap = tm_shape(wdAMOct_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Morning Peak in Oct 2023")
For Weekday Afternoon Peak:
wdPMAug_lisasig = wdPMAug_lisa %>%
filter(p_ii_sim < 0.05)
wdPMAug_lisamap = tm_shape(wdPMAug_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Afternoon Peak in Aug 2023")
wdPMSep_lisasig = wdPMSep_lisa %>%
filter(p_ii_sim < 0.05)
wdPMSep_lisamap = tm_shape(wdPMSep_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Afternoon Peak in Sep 2023")
wdPMOct_lisasig = wdPMOct_lisa %>%
filter(p_ii_sim < 0.05)
wdPMOct_lisamap = tm_shape(wdPMOct_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekday Afternoon Peak in Oct 2023")
For Weekend/Holiday Morning Peak:
weAMAug_lisasig = weAMAug_lisa %>%
filter(p_ii_sim < 0.05)
weAMAug_lisamap = tm_shape(weAMAug_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Morning Peak in Aug 2023")
weAMSep_lisasig = weAMSep_lisa %>%
filter(p_ii_sim < 0.05)
weAMSep_lisamap = tm_shape(weAMSep_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Morning Peak in Sep 2023")
weAMOct_lisasig = weAMOct_lisa %>%
filter(p_ii_sim < 0.05)
weAMOct_lisamap = tm_shape(weAMOct_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Morning Peak in Oct 2023")
For Weekend/Holiday Evening Peak:
wePMAug_lisasig = wePMAug_lisa %>%
filter(p_ii_sim < 0.05)
wePMAug_lisamap = tm_shape(wePMAug_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Evening Peak in Aug 2023")
wePMSep_lisasig = wePMSep_lisa %>%
filter(p_ii_sim < 0.05)
wePMSep_lisamap = tm_shape(wePMSep_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Evening Peak in Sep 2023")
wePMOct_lisasig = wePMOct_lisa %>%
filter(p_ii_sim < 0.05)
wePMOct_lisamap = tm_shape(wePMOct_lisasig) +
tm_fill(col = "mean",
title = "Cluster/Outlier",
id = "grid_id",
showNA = FALSE,
alpha = 0.5,
popup.vars = c("Cluster/Outlier: " = "mean")) +
tm_borders(col = "grey40", lwd = 0.7) +
tm_layout(title = "LISA Map of Weekend/Holiday Evening Peak in Oct 2023")
5.4 Comparison Between Different Peak Periods
5.4.1 August 2023
Weekday AM versus Weekday PM:
tmap_arrange(wdAMAug_lisamap, wdPMAug_lisamap, ncol = 2)
Weekend/Holiday AM versus Weekend/Holiday PM:
tmap_arrange(weAMAug_lisamap, wePMAug_lisamap, ncol = 2)
5.4.2 September 2023
Weekday AM versus Weekday PM:
tmap_arrange(wdAMSep_lisamap, wdPMSep_lisamap, ncol = 2)
Weekend/Holiday AM versus Weekend/Holiday PM:
tmap_arrange(weAMSep_lisamap, wePMSep_lisamap, ncol = 2)
5.4.3 October 2023
Weekday AM versus Weekday PM:
tmap_arrange(wdAMOct_lisamap, wdPMOct_lisamap, ncol = 2)
Weekend/Holiday AM versus Weekend/Holiday PM:
tmap_arrange(weAMOct_lisamap, wePMOct_lisamap, ncol = 2)